Smart Paper Notes

TAFA: Design Automation of Analog Mixed-Signal FIR Filters Using Time Approximation Architecture

Shiyu Su, Qiaochu Zhang, Juzheng Liu, Mohsen Hassanpourghadi, Rezwan Rasul, Mike Shuo-Wei Chen

Category: Machine Learning

2021-12-15

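Digital finite impulse response (FIR) filter design is fully synthesizable thanks to the mature CAD support for digital circuitry. In contrast, analog mixed-signal (AMS) filter design is mostly a manual process, including architecture selection, schematic design, and layout. This work presents a systematic design methodology to automate AMS FIR filter design using a time approximation architecture without any tunable passive components, such as switched capacitors or resistors. It not only enhances the flexibility of the filter but also facilitates design automation with reduced analog complexity. The proposed design flow features a hybrid approximation scheme that automatically optimizes the filter's impulse response in light of time quantization effects, which shows significant performance improvement with minimal designer effort in the loop. Additionally, a layout-aware regression model based on an artificial neural network (ANN), in combination with a gradient-based search algorithm, is used to automate and expedite the filter design. With the proposed framework, we demonstrate rapid synthesis of AMS FIR filters in a 65nm process from specification to layout.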

Analog/Mixed-Signal Circuit Synthesis Enabled by the Advancements of Circuit Architectures and Machine Learning Algorithms

Shiyu Su, Qiaochu Zhang, Mohsen Hassanpourghadi, Juzheng Liu, Rezwan A Rasul, Mike Shuo-Wei Chen

Category: Machine Learning

2021-12-15

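Analog mixed-signal (AMS) circuit architectures have evolved to become more digital-friendly due to technology scaling and the demand for higher flexibility and reconfigurability. Meanwhile, the design complexity and cost of AMS circuits have increased substantially because of the need to optimize the sizing, layout, and verification of complex AMS circuits. On the other hand, machine learning (ML) algorithms have grown exponentially over the past decade and are actively exploited by the electronic design automation (EDA) community. This paper identifies the opportunities and challenges brought about by this trend and gives an overview of several emerging AMS design methodologies enabled by the recent evolution of AMS circuit architectures and machine learning algorithms. Specifically, we focus on using neural-network-based surrogate models to expedite circuit design parameter search and layout iterations. Finally, we demonstrate the rapid synthesis of several AMS circuit examples from specification to silicon prototype, with significantly reduced human intervention.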

Asking Clarification Questions for Code Generation in General-Purpose Programming Language

Haau-Sing Li, Mohsen Mesgar, André F. T. Martins, Iryna Gurevych

Category: Natural Language Processing

2022-12-19

Code generation from text requires understanding the user's intent from a natural language description (NLD) and generating an executable code snippet that satisfies this intent. While recent pretrained language models (PLMs) demonstrate remarkable performance on this task, they fail when the given NLD is ambiguous because it lacks the specifications needed to generate a high-quality code snippet. In this work, we introduce a novel and more realistic setup for this task. We hypothesize that ambiguities in the specifications of an NLD can be resolved by asking clarification questions (CQs). Therefore, we collect and introduce a new dataset named CodeClarQA containing NLD-Code pairs with created CQAs. We evaluate the performance of PLMs for code generation on our dataset. The empirical results support our hypothesis that clarifications result in more precise generated code, as shown by improvements of 17.52 in BLEU, 12.72 in CodeBLEU, and 7.7% in exact match. Alongside this, our task and dataset introduce new challenges to the community, including when and what CQs should be asked.

GAN-based Tabular Data Generator for Constructing Synopsis in Approximate Query Processing: Challenges and Solutions

Mohammadali Fallahian, Mohsen Dorodchi, Kyle Kreth

Category: Machine Learning

2022-12-18

In data-driven systems, data exploration is imperative for making real-time decisions. However, big data is stored in massive databases from which it is difficult to retrieve. Approximate Query Processing (AQP) is a technique for providing approximate answers to aggregate queries based on a summary of the data (a synopsis) that closely replicates the behavior of the actual data. It is useful wherever an approximate answer is acceptable and can be delivered in a fraction of the real execution time. In this paper, we discuss the use of Generative Adversarial Networks (GANs) for generating tabular data that can be employed in AQP for synopsis construction. We first discuss the challenges associated with constructing synopses in relational databases and then introduce solutions to those challenges. Following that, we organize statistical metrics to evaluate the quality of the generated synopses. We conclude that tabular data complexity makes it difficult for algorithms to understand relational database semantics during training, and that improved versions of tabular GANs are capable of constructing synopses to revolutionize data-driven decision-making systems.
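
To make the idea of answering aggregate queries from a synopsis concrete, here is a minimal sketch. It assumes the simplest possible synopsis, a uniform random sample, rather than the GAN-generated tables the abstract discusses; the function names and the `amount` column are hypothetical and chosen only for illustration.

```python
import random


def build_synopsis(rows, sample_size, seed=0):
    """Draw a uniform random sample of the rows (stand-in for a learned synopsis)."""
    rng = random.Random(seed)
    return rng.sample(rows, sample_size)


def approx_avg(synopsis, column):
    """Approximate AVG(column) by the mean over the synopsis."""
    return sum(row[column] for row in synopsis) / len(synopsis)


def approx_sum(synopsis, total_rows, column):
    """Approximate SUM(column) by scaling the synopsis sum up to the full table size."""
    scale = total_rows / len(synopsis)
    return scale * sum(row[column] for row in synopsis)


# Hypothetical table: 10,000 rows with an "amount" column cycling through 0..99.
table = [{"amount": float(i % 100)} for i in range(10_000)]
synopsis = build_synopsis(table, sample_size=1_000)

exact_avg = sum(row["amount"] for row in table) / len(table)
print("exact AVG :", exact_avg)
print("approx AVG:", approx_avg(synopsis, "amount"))
print("approx SUM:", approx_sum(synopsis, len(table), "amount"))
```

The query touches only 1,000 rows instead of 10,000, which is the trade of accuracy for execution time that AQP relies on. A generated synopsis would replace `build_synopsis` while the query functions stay the same.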

Fast Learning of Multidimensional Hawkes Processes via Frank-Wolfe

Renbo Zhao, Niccolò Dalmasso, Mohsen Ghassemi, Vamsi K. Potluru, Tucker Balch, Manuela Veloso

Category: Machine Learning

2022-12-12

Hawkes processes have recently risen to the forefront of tools for modeling and generating sequential event data. Multidimensional Hawkes processes model both the self- and cross-excitation between different types of events and have been applied successfully in various domains such as finance, epidemiology, and personalized recommendations, among others. In this work, we present an adaptation of the Frank-Wolfe algorithm for learning multidimensional Hawkes processes. Experimental results show that our approach achieves parameter-estimation accuracy better than or on par with other first-order methods, while enjoying a significantly faster runtime.
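
For readers unfamiliar with Frank-Wolfe, the following is a minimal sketch of the generic algorithm, not the paper's adaptation. It assumes a toy quadratic objective and the probability simplex as the feasible set, both chosen purely for illustration; a Hawkes likelihood and its constraint set would take their place in practice.

```python
def frank_wolfe_simplex(grad, x0, num_iters=200):
    """Minimize a smooth convex function over the probability simplex.

    grad: function returning the gradient (list of floats) at a point.
    x0:   feasible starting point (non-negative entries summing to 1).
    """
    x = list(x0)
    for k in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, <g, s> is minimized
        # at the vertex whose gradient coordinate is smallest.
        i_min = min(range(len(g)), key=g.__getitem__)
        # Standard open-loop step size 2 / (k + 2).
        step = 2.0 / (k + 2.0)
        # Convex combination of the current point and the chosen vertex,
        # so the iterate stays feasible without any projection.
        x = [(1.0 - step) * xi for xi in x]
        x[i_min] += step
    return x


# Toy objective (an assumption): f(x) = 0.5 * ||x - target||^2.
target = [0.2, 0.5, 0.3]


def toy_grad(x):
    return [xi - ti for xi, ti in zip(x, target)]


solution = frank_wolfe_simplex(toy_grad, [1.0, 0.0, 0.0], num_iters=2000)
print(solution)
```

Each iteration solves only a linear problem over the feasible set and never projects, which is the usual reason Frank-Wolfe iterations are cheap compared with projected first-order methods.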

Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs

Osman Ülger, Julian Wiederer, Mohsen Ghafoorian, Vasileios Belagiannis, Pascal Mettes

Category: Computer Vision

2022-12-06

Graph neural networks have been shown to learn effective node representations, enabling node-, link-, and graph-level inference. Conventional graph networks assume static relations between nodes, while relations between entities in a video often evolve over time, with nodes entering and exiting dynamically. In such temporally-dynamic graphs, a core problem is inferring the future state of spatio-temporal edges, which can constitute multiple types of relations. To address this problem, we propose MTD-GNN, a graph network for predicting temporally-dynamic edges for multiple types of relations. We propose a factorized spatio-temporal graph attention layer to learn dynamic node representations and present a multi-task edge prediction loss that models multiple relations simultaneously. The proposed architecture operates on top of scene graphs that we obtain from videos through object detection and spatio-temporal linking. Experimental evaluations on ActionGenome and CLEVRER show that modeling multiple relations in our temporally-dynamic graph network can be mutually beneficial, outperforming existing static and spatio-temporal graph neural networks, as well as state-of-the-art predicate classification methods.

Longest Common Substring in Longest Common Subsequence's Solution Service: A Novel Hyper-Heuristic

Alireza Abdi, Masih Hajsaeedi, Mohsen Hooshmand

Category: Artificial Intelligence

2022-12-03

The Longest Common Subsequence (LCS) problem is that of finding the longest subsequence that is common to all strings in a given set. The LCS has applications in computational biology and text editing, among many others. Due to the NP-hardness of the general longest common subsequence problem, numerous heuristic algorithms and solvers have been proposed to give the best possible solution for different sets of strings. None of them has the best performance for all types of sets. In addition, there is no method to specify the type of a given set of strings. Besides that, the available hyper-heuristic is not efficient and fast enough to solve this problem in real-world applications. This paper proposes a novel hyper-heuristic to solve the longest common subsequence problem using a novel criterion to classify a set of strings based on their similarity. To do this, we offer a general stochastic framework to identify the type of a given set of strings. Following that, we introduce the set similarity dichotomizer ($S^2D$) algorithm, based on the framework, which divides sets into two types. This algorithm is introduced for the first time in this paper and opens a new way to go beyond the current LCS solvers. Then, we present a novel hyper-heuristic that exploits the $S^2D$ and one of the internal properties of the set to choose the best matching heuristic among a set of heuristics. We compare the results on benchmark datasets with the best heuristics and hyper-heuristics. The results show a higher performance of our proposed hyper-heuristic in both solution quality and run time.
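
As a reminder of what the problem asks, the sketch below solves the two-string special case with the classic dynamic program. It is not the paper's hyper-heuristic or the $S^2D$ algorithm: the general problem over an arbitrary set of strings is the NP-hard case that motivates heuristics, while two strings can be handled exactly in O(mn) time.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of two strings."""
    m, n = len(a), len(b)
    # dp[i][j] = length of an LCS of a[:i] and b[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back through the table to recover the subsequence itself.
    chars = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            chars.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(chars))


print(lcs("AGGTAB", "GXTXAYB"))
```

Extending this table to k strings needs a k-dimensional array, so time and memory grow exponentially with the number of strings, which is why heuristic solvers dominate in practice.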

Warmup and Transfer Knowledge-Based Federated Learning Approach for IoT Continuous Authentication

Mohamad Wazzeh, Hakima Ould-Slimane, Chamseddine Talhi, Azzam Mourad, Mohsen Guizani

Category: Machine Learning | Artificial Intelligence

2022-11-10

Continuous behavioural authentication methods add a unique layer of security by allowing individuals to verify their identity when accessing a device. Maintaining session authenticity is now feasible by monitoring users' behaviour while they interact with a mobile or Internet of Things (IoT) device, making credential theft and session hijacking ineffective. Such a technique is made possible by integrating the power of artificial intelligence and Machine Learning (ML). Most of the literature focuses on training machine learning models for the user by transmitting their data to an external server, which exposes private user data to threats. In this paper, we propose a novel Federated Learning (FL) approach that protects the anonymity of user data and maintains its security. We present a warmup approach that provides a significant accuracy increase. In addition, we leverage the transfer learning technique based on feature extraction to boost the models' performance. Our extensive experiments based on four datasets (MNIST, FEMNIST, CIFAR-10 and UMDAA-02-FD) show a significant increase in user authentication accuracy while maintaining user privacy and data security.

BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning

Mohsen Fayyaz, Ehsan Aghazadeh, Ali Modarressi, Mohammad Taher Pilehvar, Yadollah Yaghoobzadeh, Samira Ebrahimi Kahou

Category: Natural Language Processing

2022-11-10

Current pre-trained language models rely on large datasets to achieve state-of-the-art performance. However, past research has shown that not all examples in a dataset are equally important during training. In fact, it is sometimes possible to prune a considerable fraction of the training set while maintaining the test performance. Established on standard vision benchmarks, two gradient-based scoring metrics for finding important examples are GraNd and its estimated version, EL2N. In this work, we employ these two metrics for the first time in NLP. We demonstrate that these metrics need to be computed after at least one epoch of fine-tuning and that they are not reliable in early steps. Furthermore, we show that by pruning a small portion of the examples with the highest GraNd/EL2N scores, we can not only preserve the test accuracy, but also surpass it. This paper details adjustments and implementation choices which enable GraNd and EL2N to be applied to NLP.
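
A minimal sketch of score-based pruning follows. It assumes the common definition of EL2N as the L2 norm of the difference between the softmax output and the one-hot label, computed here from a single model's logits (the score is usually averaged over several runs). The logits, labels, and pruning fraction are made up for illustration and do not come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def el2n_score(logits, label):
    """L2 norm of (softmax(logits) - one_hot(label))."""
    probs = softmax(logits)
    return math.sqrt(
        sum((p - (1.0 if k == label else 0.0)) ** 2 for k, p in enumerate(probs))
    )


def prune_highest(scores, fraction):
    """Return indices of examples kept after dropping the top-scoring fraction."""
    n_drop = int(len(scores) * fraction)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    dropped = set(ranked[:n_drop])
    return [i for i in range(len(scores)) if i not in dropped]


# Hypothetical logits for four examples; the last one is confidently wrong.
example_logits = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0], [5.0, 0.0, 0.0]]
example_labels = [0, 1, 2, 1]

scores = [el2n_score(z, y) for z, y in zip(example_logits, example_labels)]
kept = prune_highest(scores, fraction=0.25)
print(scores)
print(kept)
```

Confidently misclassified examples receive the largest scores, so dropping the highest-scoring slice removes the examples the model disagrees with most strongly.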

ON-DEMAND-FL: A Dynamic and Efficient Multi-Criteria Federated Learning Client Deployment Scheme

Mario Chahoud, Hani Sami, Azzam Mourad, Safa Otoum, Hadi Otrok, Jamal Bentahar, Mohsen Guizani

Category: Artificial Intelligence | Machine Learning

2022-11-05

In this paper, we increase the availability and integration of devices in the learning process to enhance the convergence of federated learning (FL) models. To address the issue of having all the data in one location, federated learning, which maintains the ability to learn over decentralized data sets, combines privacy and technology. Until the model converges, the server combines the updated weights obtained from each dataset over a number of rounds. Most of the literature proposes client selection techniques to accelerate convergence and boost accuracy. However, none of the existing proposals focuses on the flexibility to deploy and select clients as needed, wherever and whenever that may be. Because the environment is extremely dynamic, some devices are not available to serve as clients in FL, which affects the availability of data for learning and the applicability of existing client selection solutions. In this paper, we address these limitations by introducing On-Demand-FL, a client deployment approach for FL that offers more volume and heterogeneity of data in the learning process. We make use of containerization technology such as Docker to build efficient environments using IoT and mobile devices serving as volunteers. Furthermore, Kubernetes is used for orchestration. A genetic algorithm (GA) is used to solve the multi-objective optimization problem due to its evolutionary strategy. Experiments using the Mobile Data Challenge (MDC) dataset and the Localfed framework illustrate the relevance of the proposed approach and the efficiency of the on-the-fly deployment of clients whenever and wherever needed, with fewer discarded rounds and more available data.
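
The server-side step in which updated weights are combined each round can be sketched as follows. The abstract does not say how the weights are combined, so this assumes the widely used sample-count-weighted average (FedAvg style). The function name and the flat-list weight representation are hypothetical simplifications.

```python
def federated_average(client_weights, client_sizes):
    """Combine client weight vectors into one global vector.

    client_weights: list of equally long lists of floats, one per client.
    client_sizes:   number of local training samples held by each client.
    Each client's contribution is proportional to its sample count
    (an assumption; other aggregation rules are possible).
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]


# Two hypothetical clients; the second holds three times as much data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)
```

A client that is unavailable in a given round simply drops out of both lists, which is why device availability directly affects how much data each round represents.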
